Variations of k-mean Algorithm: A Study for High-Dimensional Large Data Sets

Similar resources

Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters

This paper presents an implementation of brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large high-dimensional data clouds. The proposed method uses Graphics Processing Units (GPUs) and scales across multiple levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU). The method is applicable to homogeneous computi...

Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not perform well under these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...

An approximate algorithm for top-k closest pairs join query in large high dimensional data

In this paper we present a novel approximate algorithm to calculate the top-k closest pairs join query of two large and high dimensional data sets. The algorithm has worst case time complexity $O(dnk)$ and space complexity $O(nd)$ and guarantees a solution within a $O(d^{1+\frac{1}{t}})$ factor of the exact one, where $t \in \{1, 2, \ldots, \infty\}$ denotes the Minkowski metrics $L_t$ of interest and $d$ the dimensionality. It m...

O-Cluster: Scalable Clustering of Large High Dimensional Data Sets

Clustering large data sets of high dimensionality has always been a serious challenge for clustering algorithms. Many recently developed clustering algorithms have attempted to handle either data sets with a very large number of records or data sets with a very high number of dimensions. This paper provides a discussion of the advantages and limitations of existing algorithms when they op...

Clustering of Data Using K-Mean Algorithm

Clustering is an automatic learning technique aimed at grouping a collection of objects into subsets or clusters. The goal is to form clusters that are internally coherent but clearly different from one another. In plain words, objects within the same cluster should be as similar as possible, whereas objects in one cluster should be as dissimilar as possible from ...

Journal

Journal title: Information Technology Journal

Year: 2006

ISSN: 1812-5638

DOI: 10.3923/itj.2006.1132.1135